Explanation template for CAST_Gray



Introduction


This markdown document is designed to illustrate the concepts, principles and methods used in CAST, a Probabilistic Decline Curve Analysis (PDCA) automation script and user interface. The CAST program is coded in R and the user interface is built in Spotfire to take advantage of its user-friendly visualizations and code integration. This document explains what the CAST code is doing in the background, so that the outputs viewed in Spotfire can be better understood.

Background


Some background information and principles used within this document are outlined below; skip to the next section to begin the illustrations. We will first define the ARPS models that have historically been used for decline analysis and that we will evaluate in this document to illustrate PDCA.

PDCA extends deterministic ARPS decline curve analysis by producing a distribution of possible forecasts rather than a single curve, which is why we would like the PDCA methodology integrated into our workflows. The remainder of this document walks through how we go about executing PDCA.

Bayesian theory plays a part here, and is worth explaining briefly. We want a distribution of the forecasts, but we do not have a set of forecasts to build this distribution from. Bayesian theory lets us estimate the posterior distribution by randomly sampling from a prior distribution and evaluating each sample against the observed data.

To explain the principles, let’s look at a sample public dataset of a few wells in Alberta.

Sample Dataset


To illustrate the CAST methods, let’s look at a sample dataset of a few wells. The sample data contains both oil and gas rates for each producing month, taken from the public domain. Let’s look at the rate-time plots for both streams of the sample data.

Production plots

Looking at the production profiles for each sample well, we can summarize key parameters for each well. Months on production, initial oil and gas rates, peak oil and gas rates, peak month and date information, and cumulative volumes for each phase are all important parameters used in creating an ARPS decline. Let’s summarize these parameters for the oil stream.
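The summary itself can be sketched with a small helper. `summarize.well` is a hypothetical function (not part of CAST), and the column meanings are assumed from the summary table: rates in bbl/d and monthly produced volumes in bbl.

```r
## Hypothetical per-well summary; assumes monthly vectors of dates,
## oil rates (bbl/d) and monthly produced oil volumes (bbl)
summarize.well <- function(dates, oil.rate, oil.vol){
  peak.ind <- which.max(oil.rate)
  list(OnProd     = min(dates),       # first producing month
       InitialOil = oil.rate[1],      # initial oil rate
       PeakOil    = max(oil.rate),    # peak oil rate
       PeakMonth  = peak.ind,         # month index of the peak
       CumOil     = sum(oil.vol))     # cumulative oil volume
}
```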

Summary Table

| UWI | OnProd Date | Initial Oil Rate | Peak Oil Rate | Peak Month | Cum Oil |
|---|---|---|---|---|---|
| 100/03-17-081-16W6/00 | 2018-04-01 | 11.951 | 232.730 | 3 | 905924.2 |
| 100/05-17-081-16W6/00 | 2018-04-01 | 26.418 | 328.338 | 3 | 1839089.1 |
| 100/11-17-081-16W6/00 | 2018-05-01 | 218.892 | 544.714 | 2 | 2728403.2 |
| 100/12-17-081-16W6/00 | 2018-05-01 | 265.438 | 538.424 | 2 | 2338572.9 |
| 102/05-17-081-16W6/00 | 2018-04-01 | 6.919 | 406.334 | 3 | 1134906.0 |
| 102/12-17-081-16W6/00 | 2018-05-01 | 135.235 | 439.671 | 2 | 1704462.9 |
| 103/05-17-081-16W6/00 | 2018-04-01 | 7.548 | 378.658 | 3 | 1660988.3 |


These summary parameters will be re-visited in the extra learning functionality of CAST to prioritize which matches will be completed first.

Decline Models


Modified Hyperbolic

Let’s define the modified hyperbolic equation in executable form. The equation will accept decline parameters and a time series across which to evaluate production rates. First we’ll need functions to convert an effective secant decline percentage to a nominal percentage and vice versa; then we will define the modified hyperbolic function as “MH”.

## to.nominal function converts effective decline rates to nominal decline rates
to.nominal<-function(De,b=0){
  if(b!=0){
    return((((1 - De)^(-b))-1)/b)
  } else{
    return(-log1p(-De))
  }
}

## to.secant function converts nominal decline rates to effective secant decline rates
to.secant <- function(a,b=0){
  if(b!=0){
    return(1-(1+a*b)^(-1/b))
  } else {
    return(1-exp(-a))
  }
}

##  MH function is the modified hyperbolic function, where a single b value transitions 
##  to an exponential decline at a limiting decline rate
# Function will accept equation constants and an elapsed time value in days
# Output will be rate at given time for given parameters
MH<-function(qi, b = 1.5, Di = 0.7, Dlim = 0.07, q.abd = 3 ,t = c("Enter In Days")){
  #qi must be entered as a rate
  #Di must be decimal in effective secant format as % year, less than 1
  #t must be entered as days
  
  #Convert Di and Dlim to nominal
  ai <- to.nominal(Di,b)
  alim <- to.nominal(Dlim)
  
  #Convert t to yearly time fraction
  t <- t/365.25
  
  #TODO: add error handlers for true exponential decline and for initial
  # parameters that bypass the exponential tail (i.e. if Di < Dlim);
  # in that case the decline should use Di alone, with no Dlim switch
  if(b<1e-10){
    
    q <- qi * exp(-ai*t)
    
  } else{
    #Calculate limiting decline rate
    qlim <- qi*(alim/ai)^(1/b)
    
    #Calculate limiting decline time, in years
    tlim <- (((qi/qlim)^b)-1)/(b*ai)
    
    #Calculate q assuming hyperbolic
    q<-qi/((1+b*ai*t)^(1/b))
    
    #Replace q when t > tlim
    q[t > tlim] <- qlim*exp(-alim*(t[t>tlim] - tlim))
    
  }
  
  #Abandonment cutoff. When q < q.abd, set q to a negligible value
  q[q < q.abd] <- 1e-10
  
  return(q)
}

Multi-segment

Now we’ll define a simple multi-segment model in executable form. The multi-segment equation will accept the same parameters as the modified hyperbolic equation, but will have an additional hyperbolic segment before hitting the limiting decline rate. This secondary hyperbolic segment will be defined by bf (usually set to 0.5 to indicate boundary dominated flow) and the transition from the first to the second segment will occur at time to end of linear flow, or telf. This transition will maintain the decline rate prior to telf, but will alter the rate at which it changes. The multi-segment equation function will be defined as “MS”, and will call MH within.

## MS function is the multi-segment (two segment) version of the modified hyperbolic equation.
## The equation will transition from bi to bf and finally to an exponential decline.
#  Function will accept all required input parameters, abandonment rates, and a time vector
#  in days, returning a rate vector across the input time interval. 
#  MH will be called within after initial segment has been populated.

MS<-function(qi, bi = 1.5, Di = 0.7, bf = 0.5, Dlim = 0.07, telf = 365.25,
             q.abd = 3 ,t = c("Enter In Days")){
 
  #qi must be entered as a rate
  #bi must be greater than bf
  #Di must be a decimal in secant form as % per year
  #t and telf must be entered in days
  
  # Convert Di and Dlim to nominal
  ai <- to.nominal(Di,bi)
  alim <- to.nominal(Dlim)

  # Note: t stays in days here; MH handles the conversion to years
  
  # Find index of t closest to telf
  telf_ind <- which.min(abs(t-telf))
  
  # Basic input validation: bi must be greater than bf for the
  # transition to boundary dominated flow to make physical sense
  if(bi <= bf) stop("bi must be greater than bf")
  
  # Find qlim and tlim of initial decline segment
  qlim <- qi*(alim/ai)^(1/bi)
  tlim <- (((qi/qlim)^bi)-1)/(bi*ai)
  
  # If tlim is < telf, only one segment to dlim, call MH.
  if(tlim <= t[telf_ind]){
    # One segment
    q <- MH(qi,bi,Di,Dlim,q.abd,t)
  } else {
    # Two segment
    # Call MH to Calc q for segment one, q_one from t to telf
    t1 <- t[1:telf_ind]
    q1 <- MH(qi,bi,Di,Dlim,q.abd,t = t1)
    
    # Find instantaneous nominal decline at telf (aelf, per year) via a
    # backward finite difference on the last two points of segment one;
    # the day interval is converted to years to match the per-year
    # nominal declines used throughout
    dt.yr <- (tail(t1,1) - tail(t1,2)[1])/365.25
    aelf <- -(1/tail(q1,1))*(tail(q1,1) - tail(q1,2)[1])/dt.yr
    
    # Calculate and normalize second time vector; needs to start at 0
    t2 <- t[-(1:telf_ind)] - t[-(1:telf_ind)][1]
    
    # Call MH to populate q2
    q2 <- MH(tail(q1,1),bf,to.secant(aelf,bf),Dlim,q.abd,t2)
    
    # Combine the two rate vectors
    q <- c(head(q1,-1),q2)
  }
  
  return(q)
}

DCA


Now that we’ve defined the models we need in executable form, how can we automate fitting an ARPS curve to a sample production dataset? A typical methodology takes a range of input values and evaluates potential matches until the match with the least error is found. To accomplish this we can use Monte Carlo sampling to draw 10,000 parameter sets from the input distribution space and evaluate each set against the actual production data. The Monte Carlo run will not only illustrate the error functions we will use, but will also serve as the initialization phase for the CAST probabilistic process.

Monte Carlo

Focusing on the Modified Hyperbolic equation for oil, let’s first define an input parameter space within which the solution should exist. To keep it simple, we’ll use uniform distributions for each parameter, contained in a DefineDistribution class object.


Now the only parameter that is missing from the solution space is qi. This is obviously dependent on the well we are looking to match, so let’s take the 100/05-17-081-16W6/00 well from the sample dataset as our example well. We will define the qi solution space a little differently, using a uniform distribution bounded by peak rate multipliers to keep the samples within reason.


To understand the input distributions, let’s visualize them. We’ll randomly sample 10,000 iterations from each DefineDistribution class object and plot the results of each parameter input.
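A sketch of what this sampling might look like; the bounds below are illustrative assumptions, not the CAST defaults, which come from the DefineDistribution objects described above. The qi bounds use assumed peak-rate multipliers on the example well's 328.338 peak rate.

```r
## Illustrative sampling of the input parameter space (assumed bounds)
set.seed(1)
b.s  <- runif(10000, min = 0.5, max = 2.0)    # hyperbolic exponent b
di.s <- runif(10000, min = 0.5, max = 0.99)   # effective secant decline /yr
qi.s <- runif(10000, min = 0.5, max = 3.0) * 328.338  # peak-rate multiples
par(mfrow = c(1, 3))
hist(b.s, main = "b"); hist(di.s, main = "di"); hist(qi.s, main = "qi")
```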


Uniform Distributions of b, di and qi


Production Model Synthesis

With the solution space defined, we now need to compare each sample set to the actual production to find the best fit. We’ll need to synthesize production rates for each sample set by substituting them into the Modified Hyperbolic equation over a given time period.

Combined Sampled Parameters

Let’s combine and organize our sampled parameters in a data frame. We’ll use an equation-specific function to ensure completeness and repeatability for multi-well matching. This function will apply different methods based on the class of the input parameters list.
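A minimal sketch of the combined parameter frame; the uniform bounds are illustrative assumptions (CAST's equation-specific function dispatches on the input distribution classes), and dlim is held constant at 7% per year.

```r
## Sketch of combining sampled parameters into the MH parameter dataframe
set.seed(2)
n <- 10000
params <- data.frame(
  qi   = runif(n, 0.5, 3.0) * 328.338,  # assumed peak-rate multipliers
  b    = runif(n, 0.5, 2.0),
  di   = runif(n, 0.5, 0.99),
  dlim = 0.07                            # constant limiting decline
)
head(params, 10)
```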

| qi | b | di | dlim |
|---|---|---|---|
| 460.3626 | 0.5439358 | 0.8949809 | 0.07 |
| 605.3006 | 1.5731898 | 0.8582482 | 0.07 |
| 369.3622 | 1.0781440 | 0.5381730 | 0.07 |
| 178.9772 | 1.6942012 | 0.8060896 | 0.07 |
| 806.5430 | 0.9702751 | 0.9495410 | 0.07 |
| 732.3554 | 1.9068105 | 0.6602459 | 0.07 |
| 541.6230 | 1.4253972 | 0.5928134 | 0.07 |
| 791.5670 | 0.8606908 | 0.8162535 | 0.07 |
| 970.6170 | 1.2631671 | 0.5439451 | 0.07 |
| 685.9738 | 1.5790421 | 0.7175469 | 0.07 |

First 10 rows of modified hyperbolic parameters dataframe

Historical Rate Synthesis

Using the combined parameter dataframe and the modified hyperbolic equation, we can define a function that treats each row of the parameter dataframe as a Monte Carlo iteration and substitutes it into the ARPS equation across the time interval for the sample well. The function will create and return an mcIteration object for each iteration. This object is a list with slots for Iteration name, Parameters, and Synth. Additional slots will be created and populated later as we approach cost and forecasting.


To properly synthesize production data for the sample well, we need to create a time vector from the peak oil rate to the last producing month.
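A minimal sketch of such a constructor; the `make.mcIteration` name, the `arps.fn` argument and the monthly time spacing are assumptions for illustration, with the real implementation living in the CAST background.

```r
## Hypothetical constructor for an mcIteration object; 'arps.fn' is the
## decline function (e.g. MH above) and 'params' a named parameter vector
make.mcIteration <- function(params, arps.fn, t){
  list(Iteration  = paste(names(params), round(params, 2),
                          sep = "=", collapse = " "),
       Parameters = params,
       Synth      = do.call(arps.fn, c(as.list(params), list(t = t))))
}

## Monthly time vector from the peak month onward (25 months assumed)
t.hist <- (seq_len(25) - 1) * 365.25/12
```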

## $Iteration
## [1] "qi=460.36 b=0.54 di=0.89 dlim=0.07"
## 
## $Parameters
##          qi           b          di        dlim 
## 460.3625692   0.5439358   0.8949809   0.0700000 
## 
## $Synth
##  [1] 460.36257 330.41732 247.35453 192.62467 155.60928 127.75619 107.50162
##  [8]  91.34834  78.65800  69.38956  60.96063  54.21876  48.38986  43.62112
## [15]  39.41518  35.80406  32.77353  30.03893  27.71393  25.59101  23.70910
## [22]  22.13517  20.62432  19.30858  18.08037

Each mcIteration list, defining a Monte Carlo iteration and historical synthesis, can now be evaluated against the actual data to search for an appropriate model.

Loss Functions

To compare each synthesis to the historical production data observed, we need to use functions that compare predicted points to actuals for each model. These functions are commonly referred to as loss functions. The most common loss function is the L2 loss function, which is popular due to its easy implementation, especially in linear models. However, the L2 loss is not as useful when comparing data with significant outliers, such as oil and gas production data, because it assigns too much weight to these outliers. We will need to look at more robust loss functions to find models that match the historical data as an engineer would.

Soft L1

Soft L1 loss is a robust, continuous variation of the common Huber loss that gives less weight to outliers than an L2 loss, making it useful for fitting time series production data. Our Soft L1 definition will return the individual loss of each point and an average cost for all residuals input.
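One common form of the Soft L1 loss (the form used by e.g. SciPy's `least_squares`; CAST's background definition may differ slightly) is loss(r) = 2*(sqrt(1 + r^2) - 1), which behaves like r^2 for small residuals and like 2|r| for large ones:

```r
## Soft L1 loss: returns the individual loss per residual and the
## average cost across all residuals
soft.l1 <- function(resid){
  indv <- 2*(sqrt(1 + resid^2) - 1)
  list(IndvCost = indv, Cost = mean(indv))
}
```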


To apply the loss equation we’ll define a function to apply the Soft L1 loss to the residuals of each model match. Due to the non-linear nature of production data, the loss function will be applied to the log residuals. The function will populate the IndvCost and Cost slots of each mcIteration object. For this illustration, the function will be defined in the background and we can see the result for the first iteration object.
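A hypothetical version of that background function might look like the following; the `apply.soft.l1` name and slot handling are assumptions based on the printed iteration object.

```r
## Applies Soft L1 loss to the log residuals of one mcIteration against
## the actual rates (same monthly spacing assumed), populating the
## IndvCost and Cost slots
apply.soft.l1 <- function(iter, actual){
  resid <- log(actual) - log(iter$Synth)
  iter$IndvCost <- 2*(sqrt(1 + resid^2) - 1)
  iter$Cost <- mean(iter$IndvCost)
  iter
}
```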

## $Iteration
## [1] "qi=460.36 b=0.54 di=0.89 dlim=0.07"
## 
## $Parameters
##          qi           b          di        dlim 
## 460.3625692   0.5439358   0.8949809   0.0700000 
## 
## $Synth
##  [1] 460.36257 330.41732 247.35453 192.62467 155.60928 127.75619 107.50162
##  [8]  91.34834  78.65800  69.38956  60.96063  54.21876  48.38986  43.62112
## [15]  39.41518  35.80406  32.77353  30.03893  27.71393  25.59101  23.70910
## [22]  22.13517  20.62432  19.30858  18.08037
## 
## $IndvCost
##  [1] 0.3134137610 0.2965712496 0.3087135803 0.3350164450 0.2721060015
##  [6] 0.2146343907 0.0235178085 0.0433602151 0.0245615456 0.0119766911
## [11] 0.0008584473 0.0732817531 0.0628184311 0.0506526532 0.1212306678
## [16] 0.1705506027 0.1206529242 0.1697310024 0.2106775721 0.2816378928
## [21] 0.2649857987 0.2121464179 0.2548659827 0.2265560757 0.2848290830
## 
## $Cost
## [1] 0.1739739


Using the cost function applied to each iteration, we can minimize the cost to find our bestfit model. Using this bestfit, we can also visualize the cost model results for each production point in the fit.
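The minimization itself reduces to ordering iterations by their Cost slot. A sketch, with a toy stand-in for the real list of mcIteration objects:

```r
## Pick the bestfit and the top 100 models by average cost
set.seed(3)
mc.list <- lapply(runif(10000), function(cc) list(Cost = cc))  # stand-in
costs   <- sapply(mc.list, function(x) x$Cost)
bestfit <- mc.list[[which.min(costs)]]
top100  <- mc.list[order(costs)[1:100]]
```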


Using the log residuals and the Soft L1 cost function, let’s find the top 100 models from our initial 10,000 iterations and see how the matches vary from the bestfit. We’ll colour them by average cost to compare the forecasts.

Top 1% of Models Using Soft L1 Loss



We can see that the top 100 forecasts do a good job of covering the range of actual oil rates, with the lowest cost forecasts tightly grouped around the actuals. However, one observation is that the forecasts are not evenly distributed above and below the actuals to appropriately capture uncertainty in the forecasts. Let’s explore another loss function to see if we can improve this behaviour.

PDCA Loss

The Soft L1 loss function does a decent job at finding a best fit model, but is there a better loss function that can be used? Let’s look at a loss function defined in SPE-174784-PA by Fulford et al. that takes a similar approach, but is designed to capture more uncertainty in the production data. This loss function uses the average and standard deviation of the individual cost values across the fit, comparing them to a baseline fit. This will prove particularly useful when we move into the PDCA methods later in the CAST program.

For now, let’s define this loss function as PDCA Loss and see how it improves the fits from our Monte Carlo.


Just as we did with the Soft L1, let’s use the PDCA loss function to find the top 100 forecasts from the same 10,000 iterations. We’ll set the minimum mean and minimum standard deviation of epsilon to 0, assuming a perfect match. We’ll colour by average cost and compare to the Soft L1 method.


Comparing Top 1% of Models Using Soft L1 and PDCA Loss


We can see that the PDCA Loss function has improved the distribution of fits above and below the actuals, a behaviour that we would like to carry forward because it helps to describe a high and low fit, as well as a bestfit. Lastly, let’s look at the parameter distributions for the top 1% of models to get an idea of the solution space using the PDCA Loss method.


Parameters of P1 model matches using PDCA Loss method

Looking at these distributions, the accepted parameters appear mostly lognormal. This raises the question of whether we need to step into MCMC to get a proper PDCA distribution going forward; keep these distributions in mind for comparison later on.

The two charts above show that the PDCA loss provides a better probabilistic representation of the matches, so we will carry the PDCA loss forward through the rest of the evaluation.

Forecasting

Using Monte Carlo methods and loss functions to automate the DCA matching based on historicals, we can now take the top 100 models to the next step: forecasting. For each of the accepted models, we will synthesize monthly production rates for the remaining life of the sample well until an abandonment rate of 3 bbls/d is reached.
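As a sketch of the forecasting step, using a hypothetical hyperbolic-only stand-in rather than the full MH/MS functions above, with illustrative parameter values and the 3 bbl/d abandonment cutoff:

```r
## Forecast a rate profile and truncate it at the abandonment rate
rate.hyp <- function(qi, b, ai, t.yr) qi / (1 + b * ai * t.yr)^(1/b)
t.fc <- seq(0, 100, by = 1/12)          # monthly steps over 100 years
q.fc <- rate.hyp(500, 0.8, 1.5, t.fc)   # illustrative parameters
q.fc <- q.fc[q.fc >= 3]                 # remaining life ends at 3 bbl/d
```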


Visualizing each forecast we can see the distribution appears to summarize our actual data quite well. In a way, we’ve taken the first step towards PDCA. We’ve used a variety of accepted models to define possible oil rate outcomes for the sample well. To illustrate this better, let’s compile the forecasts into P10, Mean, P50, and P90 curves to represent all the forecasts.

Aggregation

The most obvious way to aggregate the individual forecasts would be to simply find the percentile rates at each monthly time interval that represent the P10, Mean, P50 and P90. While this works well in theory, in practice these aggregations have no single ARPS definition and therefore cannot be used outside of this space. Another approach is to use the EURs from each forecast to create probabilistic bins within which we can find the average ARPS equation. Since the correlation between 30-year EUR and ARPS parameters is fairly linear, this method works well. As a proof we will illustrate both methods side by side.
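The percentile aggregation can be sketched in a couple of lines. The forecast matrix here is toy data; note the oil and gas convention that P10 is the high case, i.e. the 90th rate percentile at each time step.

```r
## Percentile aggregation: forecasts arranged in a months x models matrix
set.seed(7)
fc  <- matrix(rexp(60 * 100, rate = 1/100), nrow = 60)  # toy forecasts
agg <- apply(fc, 1, quantile, probs = c(0.9, 0.5, 0.1))
## rows of 'agg' are the P10, P50 and P90 rate curves;
## rowMeans(fc) gives the Mean curve
```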

Comparison of Aggregation Methods


Both charts show similar results. We can see the P10, P50, and P90 forecasts in dashed lines, overlaid with the mean in solid black. The EUR bin method - which takes a +/- 0.5 percentile interval around the P10, Mean, P50, and P90 - looks to match the actual percentile aggregation well. This illustrates a small sample PDCA output, where we have summarized the range of possible outcomes into percentile forecasts. This has worked well using the Monte Carlo method due to the sample well having clean production data, but this will not always be the case. We will need to investigate other methods to better incorporate a relative view on uncertainty.

PDCA


Now that we’ve dipped our toes into PDCA, we might as well dive straight in. Revisiting some of the concepts illustrated earlier in this markdown, we’ll see how taking our probabilistic methodology to the next level can transform the Monte Carlo approximation into a robust model that can cope with poor initial fits.

Markov Chain Monte Carlo (MCMC)

To achieve the desired PDCA outcome that covers the range of uncertainty in all varieties of production data, we will need to incorporate a Markov Chain Monte Carlo (MCMC) method. This will allow us to recycle many of the principles we’ve previously applied in the Monte Carlo method above, while adding important functionality to further refine the resulting probabilistic forecasts. As part of the MCMC method, once we reach a bestfit approximation of the production data in the burn-in phase (here satisfied by our Monte Carlo minimization), the algorithm will then jump through the solution space. Each jump is evaluated against the last using the PDCA Loss function defined above to determine whether to accept or reject it. If accepted, the jump is added to the view of the posterior distribution and the next step initiates from it. If rejected, a new jump is sampled and evaluated. This is repeated for 10,000 iterations until the posterior distribution is defined. Using the same sample well, we can illustrate the MCMC method as used in the CAST program.

Burn-In

The burn-in process is an initial convergence step that allows the MCMC to start from an equilibrium position, in this case the bestfit from the 10,000 Monte Carlo runs. While the burn-in process is often described as unnecessary or as introducing bias, in the case of PDCA we would like to maintain its functionality to give us a baseline fit for evaluating the PDCA loss function. In plain words, there is more uncertainty in production data that is hard to fit a curve to than in data that is easy to fit, and in this way we can honour that higher level of uncertainty when needed.

Metropolis-Hastings Algorithm

With the bestfit defined as the starting point, we can start sampling through the solution space to carry out the MCMC process. The Metropolis-Hastings algorithm is a method for obtaining a sequence of random samples from multi-dimensional distributions like we have in PDCA. Simply defined, the Metropolis-Hastings method starts at an arbitrary multi-dimensional point (the start of the Markov Chain) and randomly chooses the next multi-dimensional point to evaluate given jumping distributions for each dimension. In our case the input parameters of the MH ARPS equation make up each dimension. For each iteration, the candidate sample is compared to the previous sample using an acceptance ratio. If this ratio is greater than 1, the jump is automatically accepted as being more probable than the previous sample. If the ratio is less than 1, the point is less probable than the previous sample, but can still be accepted: using a random draw, sometimes the move will be accepted and sometimes it will be rejected, with acceptance ratios closer to 1 given a higher probability of acceptance. Thus, we will tend to stay in high-density regions and define the desired distribution after sufficient iterations.

The jumping distributions we will use are defined as follows:

For each parameter j of the ARPS equation, the candidate value at iteration i is drawn from a normal distribution centred on the previous accepted value: theta_j* ~ N(theta_j^(i-1), sd_j), where sd_j is a tuned step size for that parameter.
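A single Metropolis-Hastings step can be sketched as follows; `mh.step` and the `posterior` argument are hypothetical names, with the posterior assumed to be some decreasing function of the PDCA loss.

```r
## One Metropolis-Hastings step over the parameter vector 'theta';
## 'posterior' returns an unnormalized density and 'jump.sd' the
## per-parameter jumping standard deviations
mh.step <- function(theta, posterior, jump.sd){
  cand <- rnorm(length(theta), mean = theta, sd = jump.sd)
  ratio <- posterior(cand) / posterior(theta)
  if (runif(1) < ratio) cand else theta  # accept with prob min(1, ratio)
}
```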


Applying these jumping distributions for 10,000 iterations on our sample well, we can follow the chain of accepted or rejected iterations as the posterior distribution is defined.



Let’s also look at the accepted distributions of each input parameter.


Accepted input parameters using MCMC method


The resulting accepted distributions are dependent on the jumping distributions, which need to be tuned appropriately, but we can see the posterior distributions take shape. With more samples than the Monte Carlo illustration, each distribution is well defined, following normal or lognormal distributions.

Note that the chain must stay in the proper solution space (the prior distributions) throughout the entire MCMC process; this is crucial for the resulting equations to make physical sense.

Results

Let’s compare the top Monte Carlo matches to the MCMC matches to see how the accepted solution space matches the oil rate behaviour.


Comparing Top 1% Monte Carlo Matches to MCMC Matches


We can see that the matches have a tighter spread around the actuals, returning more forecasts within the acceptable solution space. The MCMC method with the Metropolis-Hastings algorithm has achieved the desired result, returning thousands of acceptable models that define the uncertainty in forecasting the sample well. Using the same methods derived earlier in the markdown, let’s compile the accepted forecasts into P10, Mean, P50, and P90 curves to summarize the range of outcomes.


Finally, let’s summarize the EUR distribution and the EURs represented by the probabilistic forecasts.

|  | qi | b | di | dlim | EUR |
|---|---|---|---|---|---|
| P10 | 347.1222 | 0.9314419 | 0.8439423 | 0.07 | 43857.60 |
| Mean | 310.9443 | 0.8345886 | 0.8358487 | 0.07 | 31777.33 |
| P50 | 314.2023 | 0.8255995 | 0.8389456 | 0.07 | 30477.70 |
| P90 | 279.9854 | 0.7267476 | 0.8317668 | 0.07 | 21270.16 |


Multi-segment Model MCMC

Let’s briefly run the MCMC process using the multi-segment (MS) model defined previously, to look at how the choice of decline model can impact the EUR estimates. For our comparison, we’ll define a multi-segment model with the same main input distributions as the MH model, adding bf and telf inputs. For simplicity, they will be set to constant values of 0.5 and 12 months respectively.



Forecast Comparison

Let’s look at how the top 100 Monte Carlo forecasts for the MS model compare to those of the MH model.

Forecast comparison of MH (left) and MS (right) models



Note: the MS model still needs to be verified and these charts updated.

PDCA Results Comparison

Now let’s compare results from the MCMC process for the multi-segment model to that of the MH model, looking at the PDCA forecasts and EUR histogram.


MCMC results comparison of MH (left) and MS (right) models




MCMC EUR comparison of MH (left) and MS (right) models



|  | qi | bi | di | bf | telf | dlim | EUR |
|---|---|---|---|---|---|---|---|
| MH P10 | 347.1222 | 0.9314419 | 0.8439423 | NA | NA | 0.07 | 43857.60 |
| MH Mean | 310.9443 | 0.8345886 | 0.8358487 | NA | NA | 0.07 | 31777.33 |
| MH P50 | 314.2023 | 0.8255995 | 0.8389456 | NA | NA | 0.07 | 30477.70 |
| MH P90 | 279.9854 | 0.7267476 | 0.8317668 | NA | NA | 0.07 | 21270.16 |
| MS P10 | 380.9841 | 0.9328747 | 0.8469293 | 0.5 | 12 | 0.07 | 48311.18 |
| MS Mean | 338.8211 | 0.8355867 | 0.8406593 | 0.5 | 12 | 0.07 | 34118.79 |
| MS P50 | 337.6327 | 0.8170564 | 0.8381209 | 0.5 | 12 | 0.07 | 33158.96 |
| MS P90 | 298.2298 | 0.6951849 | 0.8286856 | 0.5 | 12 | 0.07 | 21867.97 |

CAST


This section will show the extra outputs the CAST program gives: how it matches distributions to the accepted parameters using fitdistrplus, outputs the ARPS parameters for the PDCA matches, and exports their rate-time data for plotting within the program. From there, refined probabilistic forecasts can be shown before stepping into the CAST extras that make some of the CAST algorithms unique: kNN automatic outlier detection, different model matches, learning, enhanced peak detection, and more.

Planned sub-headings:

1. Multi-segment
2. Outputs
3. Extras
    1. Outlier detection
    2. Learning





Darcy Redpath
Petronas Canada